Vertex Intrinsic Fitness: How to Produce Arbitrary Scale-Free Networks 
We study a recent model of random networks based on the presence of an intrinsic character of the vertices called fitness. The vertex fitnesses are drawn from a given probability distribution density. The edges between pairs of vertices are drawn according to a linking probability function that depends on the fitnesses of the two vertices involved. We study different choices for the probability distribution densities and the linking functions. We find that, irrespective of the particular choices, the generation of scale-free networks is straightforward. We then derive the general conditions under which scale-free behavior appears. This model could therefore represent a possible explanation for the ubiquity and robustness of such structures.



In the last few years, much attention has focused on the study of complex networks. A network is a mathematical object consisting of a collection of vertices (nodes) connected by edges (links) [1, 2]. Networks arise in many areas of science: biology, social sciences, the Internet, the WWW, etc., where vertices and links can be, for example, proteins and their mutual interactions, individuals and sexual relationships [13], or computers and cable connections. Very interestingly, the same nontrivial statistical properties appear ubiquitously in all the above situations. A more traditional view is represented by the binomial model inspired by the random graph model of Erdős and Rényi [14]. Here, each vertex has the same probability of connecting to any other, resulting in a network whose degree, i.e. the number of edges per vertex, has a binomial distribution. This is not the case for the above real data, where instead the structure is self-similar, resulting in a scale-free (SF) probability distribution for the degree. More specifically, the degree $k$ of the vertices is distributed according to a power law $P(k) \propto k^{\alpha}$, with usually $-3 < \alpha < -2$.

In order to explain the occurrence of SF networks, the ingredients of growth and preferential attachment have been introduced [15]. The number of vertices in the network increases with time, and newcomers tend to connect to old vertices with large degree. Nevertheless, in some cases the same SF properties appear without either growth of the system or a preferential attachment mechanism. As an example, the finite set of protein interactions in a cell forms a self-similar network. This happens without growth of the system size and without the proteins taking their reciprocal degree into account. Possibly, some external influence on intrinsic properties such as chemical affinity is instead driving the phenomenon.

To take into account this new mechanism, the varying fitness model has been introduced by Caldarelli et al. [16]. In this model, considering e.g. only undirected graphs, one extracts a real non-negative variable $x$ (the hidden variable) for each vertex of the graph from a probability distribution density $p(x)$. This variable $x$ is the fitness of the vertex. Links between vertices are then formed with a probability $f(x,y)$, a symmetric function of its arguments.
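The construction just described can be illustrated with a minimal Python sketch. The function name `generate_fitness_network`, the exponential fitness density and the product-form linking function are all assumptions made here for illustration; they are not prescribed by the model.

```python
import math
import random

def generate_fitness_network(fitness, link_prob, rng):
    """Return the adjacency sets of one network realization.

    fitness   : list of non-negative fitness values x_i
    link_prob : symmetric function f(x, y) with values in [0, 1]
    rng       : random.Random instance (explicit, for reproducibility)
    """
    n = len(fitness)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):      # each unordered pair once, no self-loops
            if rng.random() < link_prob(fitness[i], fitness[j]):
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

rng = random.Random(0)
# Illustrative choices only: exponential fitness and a product-form f.
x = [rng.expovariate(1.0) for _ in range(200)]
f = lambda a, b: (1.0 - 0.5 * math.exp(-a)) * (1.0 - 0.5 * math.exp(-b))
net = generate_fitness_network(x, f, rng)
degrees = [len(s) for s in net]
```

The double loop costs of order $N^2$ operations, which is acceptable for a sketch but would need a faster sampling scheme for large $N$.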

A static simplified form of the vertex hidden variable model has been considered for only one particular case by Goh et al. [17]. Bianconi et al. introduced a fitness mechanism coupled to preferential attachment. In the paper of Caldarelli et al. [16], the onset of SF behavior is instead directly related only to the presence of fitness, of whatever kind. This behavior was also checked there for different fitness probability distribution densities. Following Ref. [16], we present here an exhaustive study of the conditions needed to produce a SF network.

The aim of this work is to provide some ingredients to generate SF networks with the vertex hidden variable model and to provide the analytic expressions for the functions $p(x)$ and $f(x, y)$ that define SF networks in three special cases.

The fitness model can be easily generalized to have more than one fitness variable per vertex [19]. In the following, we consider a single real variable $x$ per vertex, with $x \ge 0$. As a probability distribution density, $p$ satisfies $p(x) \ge 0$ and $\int_0^\infty p(z)\,dz = 1$, while the linking probability satisfies $0 \le f(x,y) \le 1$. We define the primitive function of $p(x)$, the cumulative probability distribution $R(x) = \int_0^x p(z)\,dz$. Indicating the number of vertices in the graph with $N$, one has the vertex degree

$$k(x) = N \int_0^\infty f(x,z)\,p(z)\,dz. \tag{1}$$



Other quantities of interest are the average nearest neighbor connectivity (vertex degree correlation),

$$K_{nn}(x) = \frac{N}{k(x)} \int_0^\infty f(x,z)\,k(z)\,p(z)\,dz, \tag{2}$$

expressing the average degree of vertices that are nearest neighbors of vertices with fitness $x$, and the clustering coefficient (vertex transitivity)

$$C(x) = N^2\, \frac{\int_0^\infty\!\int_0^\infty f(x,y)\,f(y,z)\,f(z,x)\,p(y)\,p(z)\,dy\,dz}{k(x)^2}, \tag{3}$$

that counts the fraction of nearest neighbors of vertices with fitness $x$ that are also nearest neighbors of each other. Eqs. (1), (2), (3) are valid asymptotically when $N$ approaches infinity. Eqs. (2), (3) were first derived in Ref. [23], where they are expressed in a different form.
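Eqs. (1), (2), (3) can be evaluated numerically for any given pair of functions. The Python sketch below is one possible way to do so, under the assumption that the inverse of the cumulative distribution $R$ is available: averages over $p(x)$ are then approximated by averaging over midpoint quantiles. The helper names, the exponential density and the product-form $f$ are illustrative assumptions only.

```python
import math

def quantile_nodes(inv_cdf, m):
    """Midpoint quantiles x_j = R^{-1}((j + 1/2)/m); the plain average of
    h(x_j) approximates the integral of h(x) p(x) dx."""
    return [inv_cdf((j + 0.5) / m) for j in range(m)]

def degree(x, f, nodes, n):
    """Eq. (1): k(x) = N * integral of f(x, z) p(z) dz."""
    return n * sum(f(x, z) for z in nodes) / len(nodes)

def knn(x, f, nodes, n):
    """Eq. (2): K_nn(x) = N / k(x) * integral of f(x, z) k(z) p(z) dz."""
    total = sum(f(x, z) * degree(z, f, nodes, n) for z in nodes) / len(nodes)
    return n * total / degree(x, f, nodes, n)

def clustering(x, f, nodes, n):
    """Eq. (3): C(x) = N^2 * double integral of f f f p p, over k(x)^2."""
    m = len(nodes)
    total = sum(f(x, y) * f(y, z) * f(z, x) for y in nodes for z in nodes) / m**2
    return n**2 * total / degree(x, f, nodes, n) ** 2

# Illustrative inputs (assumptions): p(x) = exp(-x) and a product-form f.
N = 1000
nodes = quantile_nodes(lambda r: -math.log(1.0 - r), 200)
g = lambda x: 1.0 - 0.5 * math.exp(-x)
f = lambda x, y: g(x) * g(y)
k_half, knn_half, c_half = (fn(0.5, f, nodes, N) for fn in (degree, knn, clustering))
```

For this particular choice the averages of $g$ and $g^2$ over $p$ are $3/4$ and $7/12$, so the three outputs can be checked by hand.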
If $k(x)$ is an invertible and increasing function of $x$, then the degree probability distribution $P(k)$ is given by

$$P(k) = p(x(k))\, x'(k) \tag{4}$$

or, as a function of $x$,

$$P(k(x)) = \frac{p(x)}{k'(x)}. \tag{5}$$



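Since the degree probability is power-law distributed in most physical situations, we impose in Eq. (5) $P(k) = c\,k^{\alpha}$ with $\alpha \in \mathbb{R}$. The constant $c$ is fixed by the normalization condition $\int_{k_0}^{k_\infty} P(k)\,dk = 1$:

$$c = \begin{cases} \dfrac{\alpha+1}{k_\infty^{\alpha+1} - k_0^{\alpha+1}} & \text{if } \alpha \neq -1, \\[2ex] \dfrac{1}{\ln(k_\infty/k_0)} & \text{if } \alpha = -1, \end{cases} \tag{6}$$

with $k_0 = k(0)$ and $k_\infty = \lim_{x\to+\infty} k(x)$. Note that $k_\infty = \beta N$ for some $0 < \beta \le 1$, and $c \propto N^{-(\alpha+1)}$. Eq. (5) becomes:

$$c\,k'(x)\,k(x)^{\alpha} = p(x). \tag{7}$$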

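By integrating Eq. (7) from $0$ to $x$ we get the following nonlinear integral equation:

$$k(x) = \begin{cases} \left[k_0^{\alpha+1} + \dfrac{\alpha+1}{c}\,R(x)\right]^{\frac{1}{\alpha+1}} & \text{if } \alpha \neq -1, \\[2ex] k_0\, e^{R(x)/c} & \text{if } \alpha = -1, \end{cases} \tag{8}$$

with $k(x)$ given by Eq. (1).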
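As a quick consistency check, the normalization constant of Eq. (6) and the right-hand side of Eq. (8) can be coded directly. The sketch below assumes that $k_0 = k(0)$ and $k_\infty$ are both supplied by the user; the function names and the parameter values are illustrative and are not taken from the figures.

```python
import math

def norm_const(alpha, k0, kinf):
    """Normalization constant c of P(k) = c k^alpha on [k0, kinf], Eq. (6)."""
    if alpha == -1:
        return 1.0 / math.log(kinf / k0)
    return (alpha + 1) / (kinf ** (alpha + 1) - k0 ** (alpha + 1))

def degree_from_R(r, alpha, k0, c):
    """Right-hand side of Eq. (8) as a function of r = R(x) in [0, 1]."""
    if alpha == -1:
        return k0 * math.exp(r / c)
    return (k0 ** (alpha + 1) + (alpha + 1) * r / c) ** (1.0 / (alpha + 1))

# Illustrative parameters (assumptions).
alpha, k0, kinf = -2.5, 2.0, 500.0
c = norm_const(alpha, k0, kinf)
k_low = degree_from_R(0.0, alpha, k0, c)    # degree at R = 0
k_high = degree_from_R(1.0, alpha, k0, c)   # degree at R = 1
```

With $c$ taken from Eq. (6), the degree runs from $k_0$ at $R = 0$ to $k_\infty$ at $R = 1$, which is what the normalization of $P(k)$ requires.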

By multiplying both sides of Eq. (8) by $p(x)$ and integrating from $0$ to $\infty$ we get an analytic expression for the average vertex degree $\langle k \rangle$. This expression can be used to express $k_0$ as a function of $\langle k \rangle$, so that the final expressions depend on the physical quantity $\langle k \rangle$ only. For this purpose, the integral on the rhs is simply solved using the relation $p(x)\,dx = dR(x)$.
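The average degree can also be written as $\langle k \rangle = \int_{k_0}^{k_\infty} k\,P(k)\,dk$, which is equivalent to the integral over $R$ described above. The following Python sketch evaluates this closed form and inverts it for $k_0$ by bisection. It assumes that $k_\infty$ is held fixed while $k_0$ varies; the function names and the bisection tolerances are illustrative choices.

```python
import math

def mean_degree(alpha, k0, kinf):
    """<k> = integral of k * c * k^alpha dk over [k0, kinf], with c from Eq. (6)."""
    if alpha == -1:
        c = 1.0 / math.log(kinf / k0)
    else:
        c = (alpha + 1) / (kinf ** (alpha + 1) - k0 ** (alpha + 1))
    if alpha == -2:                      # the integrand is c / k
        return c * math.log(kinf / k0)
    return c * (kinf ** (alpha + 2) - k0 ** (alpha + 2)) / (alpha + 2)

def solve_k0(alpha, kinf, target_mean, lo=1e-9, tol=1e-12):
    """Bisection for the k0 giving a target <k>; <k> grows with k0 at fixed kinf."""
    hi = kinf
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if mean_degree(alpha, mid, kinf) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k0_for_mean_6 = solve_k0(-2.5, 500.0, 6.0)
```

The target value must lie between the mean obtained at the lower bracket and $k_\infty$, otherwise the bisection simply converges to one end of the interval.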

In the following we show an application of the model in three special cases of interest, comparing the analytic results with numerical simulations. Note that, once $N$ is fixed, computing the quantities $P(k)$, $K_{nn}(k)$, $C(k)$ from ensemble statistics requires two different averaging procedures. Firstly, we should extract a configuration $\{x_i\}_{i=1\ldots N}$ with the distribution density $p(x)$ and keep it fixed, while creating ensemble elements using the linking probability $f(x,y)$ and averaging at the end. Secondly, we should repeat the above procedure a sufficient number of times. We assume that for large enough $N$ and a large enough number of ensemble elements, the first averaging procedure, with respect to $f$, can be skipped.
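The simplified averaging procedure can be sketched in Python as follows. Each ensemble element redraws both the fitness configuration and the links, and the degree counts are pooled at the end. The function names, the small system size and the threshold linking rule are assumptions chosen only to keep the example fast.

```python
import random

def degree_sequence(fitness, link_prob, rng):
    """Degrees of one realization of the links for a given fitness configuration."""
    n = len(fitness)
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < link_prob(fitness[i], fitness[j]):
                deg[i] += 1
                deg[j] += 1
    return deg

def ensemble_degree_distribution(n, draw_fitness, link_prob, n_elements, rng):
    """Pool degree counts over n_elements networks, redrawing the fitness
    configuration each time (the inner average over links at fixed fitness
    is skipped, as assumed in the text for large N)."""
    counts = {}
    for _ in range(n_elements):
        x = [draw_fitness(rng) for _ in range(n)]
        for k in degree_sequence(x, link_prob, rng):
            counts[k] = counts.get(k, 0) + 1
    total = n * n_elements
    return {k: v / total for k, v in sorted(counts.items())}

rng = random.Random(1)
f = lambda a, b: 1.0 if a + b > 5.0 else 0.0    # illustrative threshold rule
pk = ensemble_degree_distribution(150, lambda r: r.expovariate(1.0), f, 5, rng)
```

The returned dictionary maps each observed degree to its empirical frequency. For a power-law tail, logarithmic binning of these frequencies is usually preferable before plotting.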

Here we focus on two different problems. In the first, which we call the direct problem, one assigns a distribution density function $p(x)$ and tries to find the linking probability function $f(x,y)$. In the second, which we call the inverse problem, one assigns the linking probability function and tries to determine the fitness probability distribution density $p(x)$. The inverse problem is by far more complex and interesting than the direct one. For instance, in the case of the protein SF network, by assuming a reasonable linking function we can retrieve the probability density distribution of the fitness (e.g. some basic property of the macromolecules).

We start with the special case $f(x,y) = g(x)h(y)$, where both the direct and inverse problems can be analytically solved. Because of the symmetry of $f(x,y)$ with respect to its arguments one has $g(x) = h(x)$, so that $f(x,y) = g(x)g(y)$. Eq. (1) becomes:

$$k(x) = N g(x) \int_0^\infty g(z)\,p(z)\,dz, \tag{9}$$

which, substituted into Eq. (8), gives equations in $g$ and $p$. If one fixes a given function $p(x)$, the equations in $g(x)$ can be easily solved. Take for instance the second equation, corresponding to $\alpha = -1$. One gets:

$$N g(x) \langle g \rangle = k_0\, e^{R(x)/c}, \qquad \langle g \rangle = \int_0^\infty g(z)\,p(z)\,dz.$$

By multiplying the left and right hand sides by $p(x)$ and integrating from $0$ to $\infty$, considering that $p(x)\,dx = dR(x)$, we get:

$$\langle g \rangle = \sqrt{k_0\, c\,(e^{1/c} - 1)/N}.$$

Finally, the solution reads:

$$g(x) = \sqrt{\frac{k_0}{N c\,(e^{1/c} - 1)}}\; e^{R(x)/c}. \tag{10}$$

This procedure is applicable for any value of $\alpha$. Eq. (10) generates random networks with degree probability distribution $P(k) \propto 1/k$.

In order to test the result we take the choice reported in the caption of Fig. 1. We conclude that for any given $p(x)$ there exists a function $g(x)$ such that the network generated by $p(x)$ and $f(x,y) = g(x)g(y)$ is scale-free with an arbitrary real exponent.
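The solution for $\alpha = -1$ can be checked numerically with a short Python sketch. It uses $g(x) = \sqrt{k_0/[N c\,(e^{1/c}-1)]}\; e^{R(x)/c}$ as a function of $r = R(x)$, so no specific $p(x)$ is needed. The value of $k_\infty$ is picked by hand here, which is an assumption of this example: the text does not state how it was chosen for Fig. 1. The constant $c$ then follows from Eq. (6).

```python
import math

def g_direct(r, k0, c, n):
    """Eq. (10): g as a function of r = R(x), for alpha = -1."""
    return math.sqrt(k0 / (n * c * (math.exp(1.0 / c) - 1.0))) * math.exp(r / c)

# Assumed inputs: k_inf is chosen by hand; c then follows from Eq. (6).
n, k0, kinf = 10_000, 0.1, 50.0
c = 1.0 / math.log(kinf / k0)

m = 2000                                  # midpoint rule in r, using p(x) dx = dR
mean_g = sum(g_direct((j + 0.5) / m, k0, c, n) for j in range(m)) / m

def expected_degree(r):
    """Eq. (9): k = N g <g>, evaluated at r = R(x)."""
    return n * g_direct(r, k0, c, n) * mean_g

k_at_0, k_at_1 = expected_degree(0.0), expected_degree(1.0)
g_max = g_direct(1.0, k0, c, n)           # must stay <= 1 so that f = g*g is a probability
```

The expected degree runs from $k_0$ to $k_\infty$ as $R$ goes from 0 to 1, and the requirement $g \le 1$ limits the admissible values of $k_\infty$ for given $k_0$ and $N$.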

In this case both the average nearest neighbor connectivity and the clustering coefficient are constant [23]. Respectively:

$$K_{nn} = N \langle g^2 \rangle, \tag{11}$$

$$C = \left( \frac{\langle g^2 \rangle}{\langle g \rangle} \right)^{2}, \tag{12}$$

with $\langle g^2 \rangle = \int_0^\infty g(z)^2\,p(z)\,dz$, as can be derived from Eq. (2) and Eq. (3).

FIG. 1: Vertex degree distribution generated by $f(x,y) = g(x)g(y)$, $\alpha = -1$, $p(x) = e^{-x}$, $k_0 = 0.1$, $N = 10^4$. The value of $c$ is calculated from Eq. (6) and the function $g(x)$ from Eq. (10). The number of ensemble elements was 20.

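The inverse problem for $f(x,y) = g(x)g(y)$ is solved by substituting Eq. (9) into Eq. (7):

$$p(x) = c\, g'(x)\, g(x)^{\alpha}\, (N \langle g \rangle)^{\alpha+1}.$$

Let us remark that the assumptions on $k(x)$ force $g(x)$ to be non-decreasing, with $g(\infty) > g(0) > 0$.

In the case $\alpha \neq -1$, the normalization condition $R(\infty) = 1$ results in

$$p(x) = \frac{(\alpha+1)\, g'(x)\, g(x)^{\alpha}}{g(\infty)^{\alpha+1} - g(0)^{\alpha+1}}.$$

By multiplying both sides of the previous equation by $g(x)$ and integrating from $0$ to $\infty$, one gets an expression for $\langle g \rangle$ that does not explicitly contain the function $p(x)$. This expression can be used to get the allowed value of the constant $c$:

$$c = \frac{\alpha+1}{g_\infty^{\alpha+1} - g_0^{\alpha+1}} \left[ \frac{(\alpha+2)\,\big(g_\infty^{\alpha+1} - g_0^{\alpha+1}\big)}{N(\alpha+1)\,\big(g_\infty^{\alpha+2} - g_0^{\alpha+2}\big)} \right]^{\alpha+1}$$

if $\alpha \neq -2$, with $g_\infty = g(\infty)$ and $g_0 = g(0)$. The particular cases $\alpha \in \{-1, -2\}$ can be treated similarly.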
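The inverse problem for the product form can be illustrated numerically. Combining Eq. (7), Eq. (9) and the normalization of $p$ gives, for $\alpha \neq -1$, $p(x) = (\alpha+1)\,g'(x)\,g(x)^{\alpha} / [g(\infty)^{\alpha+1} - g(0)^{\alpha+1}]$. The Python sketch below takes an assumed, purely illustrative $g$ and checks that the resulting $p$ integrates to one and reproduces the closed form of $\langle g \rangle$.

```python
import math

def fitness_density(x, g, dg, alpha, g0, ginf):
    """p(x) = (alpha+1) g'(x) g(x)^alpha / (ginf^(alpha+1) - g0^(alpha+1)), alpha != -1."""
    norm = (alpha + 1) / (ginf ** (alpha + 1) - g0 ** (alpha + 1))
    return norm * dg(x) * g(x) ** alpha

# Illustrative, assumed choice of g: increasing from g(0) = 0.5 to g(inf) = 1.
g = lambda x: 1.0 - 0.5 * math.exp(-x)
dg = lambda x: 0.5 * math.exp(-x)
alpha, g0, ginf = -2.5, 0.5, 1.0

# Midpoint rule on [0, x_max]; the tail beyond x_max is negligible here.
x_max, m = 40.0, 40_000
h = x_max / m
xs = [(j + 0.5) * h for j in range(m)]
total = h * sum(fitness_density(x, g, dg, alpha, g0, ginf) for x in xs)
mean_g = h * sum(g(x) * fitness_density(x, g, dg, alpha, g0, ginf) for x in xs)

# Closed form for <g>, obtained by multiplying p(x) by g(x) and integrating.
mean_g_exact = ((alpha + 1) / (alpha + 2)
                * (ginf ** (alpha + 2) - g0 ** (alpha + 2))
                / (ginf ** (alpha + 1) - g0 ** (alpha + 1)))
```

Once $\langle g \rangle$ is known, the constant $c$ follows from Eq. (6) with $k_0 = N g_0 \langle g \rangle$ and $k_\infty = N g_\infty \langle g \rangle$.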

The case $f(x,y) = f(x - y)$ is more complicated. In this case both the nearest neighbor connectivity and the clustering coefficient depend on the fitness $x$ and hence on the degree $k$. We managed to solve this case in the particular case of an exponentially distributed fitness. We indicate with $F(x)$ the rhs of Eq. (8). Thus Eq. (8) becomes:

$$\int_0^\infty f(x - u)\,p(u)\,du = F(x)/N.$$

By changing the integration variable into $z = x - u$ we get:

$$\int_{-\infty}^{x} p(x - z)\,f(z)\,dz = F(x)/N,$$

that in the special case $p(x) = e^{-x}$ becomes:

$$\int_{-\infty}^{x} e^{z} f(z)\,dz = e^{x} F(x)/N.$$

By differentiating with respect to the variable $x$ we finally obtain:

$$f(x,y) = \frac{F(x - y) + F'(x - y)}{N}. \tag{13}$$

In order to test the result we take the function and parameter choice given in the caption of Fig. 2.

FIG. 2: Degree distribution in the case $f(x,y) = f(x - y)$, $\alpha = -3$, $p(x) = e^{-x}$, $k_0 = 10$, $N = 10^4$, $f(u) = [F(u) + F'(u)]/N$ with $F(x)$ given by the rhs of Eq. (8), averaged 40 times. The value of $c$ is calculated from Eq. (6). The inset shows the vertex degree correlation and transitivity as functions of the vertex degree.

FIG. 3: Degree distribution in the case $f(x,y) = f(x + y)$ and $\alpha = -2$, $p(x) = e^{-x}$, $k_0 = 0.5$, $N = 10^4$, $f$ from Eq. (15), averaged 20 times. The value of $c$ is calculated from Eq. (6). The inset shows the vertex degree correlation and transitivity as functions of the vertex degree.

The case $f(x, y) = f(x + y)$ is analogous. Again, we consider the special case $p(x) = e^{-x}$, getting now:

$$f(x,y) = \frac{F(x + y) - F'(x + y)}{N}. \tag{14}$$

The solution of Eq. (8) for $\alpha = -2$, obtained via Eq. (14), reads, recalling that $k_\infty = \beta N$ and using Eq. (6):

$$f(x,y) = \frac{\beta}{\left[1 + \left(\beta N/k_0 - 1\right) e^{-(x+y)}\right]^{2}}. \tag{15}$$



Through Eq. (15) we clarify the assumption made in the original paper by Caldarelli et al. [16], where $f(x, y) = \Theta(x + y - z)$ with $z = z(N)$. Note that with the latter choice of $f(x, y)$ one gets $P(k) = N e^{-z} k^{-2}$, which forces $z$ to depend upon $N$ in order to get the correct normalization. The functional form of $z(N)$ was guessed numerically in Ref. [21]. To test the result we take the parameters reported in the caption of Fig. 3.
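A numerical check of the $f(x+y)$ case is sketched below in Python. For $\alpha = -2$ and $p(x) = e^{-x}$, Eqs. (6), (8) and (14) give the smooth linking function $\beta / [1 + (\beta N/k_0 - 1)\,e^{-(x+y)}]^2$. The code inserts it into Eq. (1) and compares the result with the rhs of Eq. (8). Taking $\beta = 1$ is an assumption of this example, as are the function names, while $k_0$ and $N$ follow the caption of Fig. 3.

```python
import math

def f_sum(s, beta, n, k0):
    """Linking probability as a function of s = x + y (alpha = -2, p(x) = exp(-x))."""
    a = beta * n / k0 - 1.0
    return beta / (1.0 + a * math.exp(-s)) ** 2

def degree_numeric(x, beta, n, k0, u_max=60.0, m=12_000):
    """Eq. (1) with f(x, u) = f_sum(x + u), by the midpoint rule on [0, u_max]."""
    h = u_max / m
    total = 0.0
    for j in range(m):
        u = (j + 0.5) * h
        total += f_sum(x + u, beta, n, k0) * math.exp(-u)
    return n * h * total

def degree_target(x, beta, n, k0):
    """Rhs of Eq. (8) for alpha = -2, with R(x) = 1 - exp(-x) and c from Eq. (6)."""
    kinf = beta * n
    return kinf / (1.0 + (kinf / k0 - 1.0) * math.exp(-x))

beta, n, k0 = 1.0, 10_000, 0.5    # beta = 1 is an assumption; k0 and N as in Fig. 3
checks = [(degree_numeric(x, beta, n, k0), degree_target(x, beta, n, k0))
          for x in (0.0, 2.0, 8.0, 12.0)]
```

The linking function rises smoothly from a small value to $\beta$ around $x + y \approx \ln(\beta N/k_0)$, which plays the role of the threshold $z(N)$ in the step-function choice.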

In these last two cases, both the nearest neighbor connectivity and the clustering coefficient show a nontrivial $k$ dependence.

In conclusion, we present a general procedure to reproduce real SF networks with arbitrary vertex degree distribution densities. More specifically, we found that, given a fitness distribution density $p(x)$, it is always possible to find a symmetric linking probability function $f(x,y)$ such that the resulting random network is scale-free with a given real exponent. We give the recipe to find these linking functions in three cases of interest. In order to allow the generation of networks even closer to the real data, it would be desirable to have control not only on the vertex degree distribution, but also on the vertex transitivity and vertex degree correlation, by solving simultaneously Eqs. (2), (3), (8). As a first step, the compatibility of these three equations should be addressed, once the functions $P(k)$, $K_{nn}(k)$, $C(k)$ are given. The solution of this problem is certainly very hard and is left open for the future. The relative ease with which we obtain SF structures seems to be the key ingredient in explaining the ubiquitous presence and robustness of such structures in real data.